This course provides a structured introduction to data science, covering essential concepts, tools, and techniques used for data analysis and modeling. It emphasizes practical applications using R and Python, enabling participants to develop proficiency in data manipulation, visualization, statistical modeling, and machine learning fundamentals. Designed with a hands-on approach, the course allows learners to apply theoretical knowledge to real-world datasets through interactive sessions and practical exercises. It is suitable for students, researchers, and professionals seeking to build foundational skills in data science and analytics. No prior experience in programming or statistics is required, making it accessible to beginners while still offering valuable insights for those with some background in the field.
Day 1: Intro to Data Science & R Basics
Installing and Loading Packages, Scripts, R for Basic Maths, Matrices and Arrays, Lists and Data Frames.
Day 2: Data Wrangling & Programming
Reading and writing files, Merging several data sources, Data structuring, Data cleaning, Conditions and loops, Calling and writing functions.
Day 3: Visualization & Graphical Excellence
Basic plotting, Advanced plotting using ggplot2, Tufte's principles of graphical excellence.
Day 4: Statistical Testing & Modeling
Hypothesis Testing, Linear regression model, Analysis of Variance, Logistic regression model, Clustering, Dimension reduction (PCA, FA), Time series.
Day 5: Use of Statistical Software for Statistical Modeling
Implementation of hypothesis testing, regression models, clustering, and dimension reduction techniques using statistical software. Hands-on practice with data visualization, model diagnostics, and interpretation of statistical outputs. Application of time series analysis for forecasting and trend identification.
Day 6: Introduction to Python
Introduction to Python and Jupyter Notebook. Python Basics: Types, Expressions, and Variables, String Operations. Python Programming Fundamentals: Conditions and Branching, Loops, Functions, Objects, and Classes. Python Data Structures: Lists, Tuples, Sets, and Dictionaries. Data analysis, manipulation, and visualization in Python: NumPy, pandas, Matplotlib, Seaborn.
Day 7: Python for Data Science
Measures of Central Tendency and Variance, Bernoulli Distribution, Binomial Distribution, Poisson Distribution, Normal Distribution, Exploring Correlation in Python, Creating a Correlation Matrix in Python, Pearson's Chi-Square Test.
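As an illustration of the kind of exercise the correlation topics might involve, the sketch below builds a Pearson correlation matrix in plain Python. It is not taken from the course materials: the column names, the data values, and the helper functions `pearson` and `correlation_matrix` are all made up for this example, and in practice a library such as pandas would usually be used instead.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (proportional to the variances).
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict {name: {name: r}} for a dict of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical data for illustration only, not a course dataset.
data = {
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 58, 65, 70, 78],
    "hours_slept": [9, 8, 7, 6, 5],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, [round(row[other], 3) for other in data])
```

Each row of the printed matrix shows how strongly one variable moves with the others: values near 1 indicate a strong positive linear relationship, and values near -1 a strong negative one.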
Day 8: Machine Learning Basics
Introduction to Machine Learning: Basic theory and principles, KNN, Decision Tree, SVM, etc.
Introduction to ANN: Human brain and ANN, Perceptron, Mathematics of NN, ANN architectures, Single-layer network, Multi-layer network, Types of ANN: Feedforward, Feedback, and Recurrent networks.
Learning Algorithms: Learning Process, Supervised learning, Unsupervised learning, Reinforcement learning, Deep learning.
Back-propagation in ANN: Gradient descent rule, Network training, and parameter optimization.
Applications: Use ANN for solving real-life problems in prediction, classification, and clustering tasks.
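To give a feel for the gradient descent rule listed above, here is a minimal sketch that trains a single linear neuron in plain Python. It is an illustrative assumption about the kind of exercise involved, not course material: the function name `train_linear_neuron`, the learning rate, the number of epochs, and the toy data are all chosen for this example. With no hidden layers there is nothing to propagate backwards, so this shows only the weight-update step that back-propagation repeats layer by layer.

```python
def train_linear_neuron(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y ~ w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = (w * x + b) - y        # prediction minus target
            grad_w += 2 * error * x / n    # d(MSE)/dw
            grad_b += 2 * error / n        # d(MSE)/db
        # Gradient descent rule: step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical, for illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = train_linear_neuron(xs, ys)
print(round(w, 3), round(b, 3))
```

The learned weight and bias should approach the slope and intercept used to generate the data. A learning rate that is too large makes the updates diverge, while one that is too small makes training slow.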
Medium of instruction: English and Sinhala.
Duration: One month (60 hours, held on Saturdays and Sundays)
Course Fee: LKR 30,000.00
Location: University of Ruhuna
In this course, we combine theory with hands-on practice. We will teach the basic concepts of data science and statistics in lectures, and then give students the opportunity to apply what they have learned using lab computers. Through practical exercises and real-world datasets, students will gain experience in using statistical software and programming languages like R and Python to solve problems and analyze data.
Participation Certificate: Maintain 80% attendance for lectures and practicals.
Successful Completion Certificate: Maintain 80% attendance and successfully complete all assignments.